Synchronous Rewriting in Treebanks

Authors

  • Laura Kallmeyer
  • Wolfgang Maier
  • Giorgio Satta
Abstract

Several formalisms have been proposed for modeling trees with discontinuous phrases. Some of these formalisms allow for synchronous rewriting. However, it is unclear whether synchronous rewriting is a necessary feature. This is an important question, since synchronous rewriting greatly increases parsing complexity. We present a characterization of recursive synchronous rewriting in constituent treebanks with discontinuous annotation. An empirical investigation reveals that synchronous rewriting is actually a necessary feature. Furthermore, we transfer this property to grammars extracted from treebanks.

Similar resources

Semgrex-Plus: a Tool for Automatic Dependency-Graph Rewriting

This paper describes the Semgrex-Plus tool, an automatic procedure we developed to convert dependency treebanks into different formats. It allows for the definition of formal rules for rewriting dependencies and token tags, and it includes an algorithm for treebank rewriting that avoids rule interference during the conversion process. This tool is publicly available.

Direct Parsing of Discontinuous Constituents in German

Discontinuities occur especially frequently in languages with a relatively free word order, such as German. Generally, due to the long-distance dependencies they induce, they lie beyond the expressivity of Probabilistic CFG, i.e., they cannot be directly reconstructed by a PCFG parser. In this paper, we use a parser for Probabilistic Linear Context-Free Rewriting Systems (PLCFRS), a formalism wi...

Exploiting Multiple Treebanks for Parsing with Quasi-synchronous Grammars

We present a simple and effective framework for exploiting multiple monolingual treebanks with different annotation guidelines for parsing. Several types of transformation patterns (TPs) are designed to capture the systematic annotation inconsistencies among different treebanks. Based on such TPs, we design quasi-synchronous grammar features to augment the baseline parsing models. Our approach ca...

Synchronous Models of Language

In synchronous rewriting, the productions of two rewriting systems are paired and applied synchronously in the derivation of a pair of strings. We present a new synchronous rewriting system and argue that it can handle certain phenomena that are not covered by existing synchronous systems. We also prove some interesting formal/computational properties of our system.

Enhanced UD Dependencies with Neutralized Diathesis Alternation

The 2.0 release of the Universal Dependency treebanks demonstrates the effectiveness of the UD scheme in coping with very diverse languages. The next step would be to get more out of syntactic analysis, and the “enhanced dependencies” sketched in the UD 2.0 guidelines are a promising attempt in that direction. In this work we propose to go further and enrich the enhanced dependency scheme along two ax...

Publication date: 2009